Handling imbalanced datasets through Optimum-Path Forest

Authors

Abstract

In the last decade, machine learning-based approaches became capable of performing a wide range of complex tasks, sometimes better than humans, demanding a fraction of the time. Such an advance is partially due to the exponential growth in the amount of data available, which makes it possible to extract trustworthy real-world information from them. However, such data is generally imbalanced since some phenomena are more likely than others. Such behavior yields considerable influence on the machine learning model's performance, since it becomes biased toward the more frequent data it receives. Despite the considerable amount of machine learning methods, a graph-based approach has attracted considerable notoriety due to its outstanding performance over many applications, i.e., the Optimum-Path Forest (OPF). In this paper, we propose three OPF-based strategies to deal with the imbalance problem: the $\text{O}^2$PF and the OPF-US, which are novel approaches for oversampling and undersampling, respectively, as well as a hybrid strategy combining both approaches. The paper also introduces a set of variants concerning the strategies mentioned above. Results compared against several state-of-the-art techniques over public and private datasets confirm the robustness of the proposed approaches.
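To make the terms oversampling and undersampling concrete, the following is a minimal sketch of the generic random versions of both ideas. It is not the $\text{O}^2$PF or OPF-US procedure from the paper, whose details are not given in this abstract; the function name `rebalance`, its parameters, and the toy data are assumptions made only for illustration.

```python
import random
from collections import Counter, defaultdict


def rebalance(samples, labels, mode="over", seed=0):
    """Randomly rebalance classes (generic baseline, NOT O^2PF or OPF-US).

    mode="over":  duplicate random samples of smaller classes until every
                  class matches the largest class size.
    mode="under": keep a random subset of each class at the smallest
                  class size.
    """
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    sizes = [len(xs) for xs in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    out_x, out_y = [], []
    for y, xs in by_class.items():
        if mode == "over":
            # Pad the class with random duplicates up to the target size.
            chosen = xs + [rng.choice(xs) for _ in range(target - len(xs))]
        else:
            # Draw a random subset without replacement.
            chosen = rng.sample(xs, target)
        out_x.extend(chosen)
        out_y.extend([y] * len(chosen))
    return out_x, out_y


# Hypothetical toy data: 8 samples of class 0 and 2 samples of class 1.
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = rebalance(X, y, mode="over")
X_under, y_under = rebalance(X, y, mode="under")
print(Counter(y_over), Counter(y_under))
```

Both calls return class counts that are equal across labels. Random duplication and random removal are only the simplest baselines; the strategies proposed in the paper instead rely on the OPF graph structure to decide which samples to generate or discard.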

Similar resources

Handling imbalanced datasets: A review

Learning classifiers from imbalanced or skewed datasets is an important topic, arising very often in practice in classification problems. In such problems, almost all the instances are labelled as one class, while far fewer instances are labelled as the other class, usually the more important class. It is obvious that traditional classifiers seeking an accurate performance over a full range of ...

Efficient supervised optimum-path forest classification for large datasets

Data acquisition technologies can provide large datasets with millions of samples for statistical analysis. This creates a tremendous challenge for pattern recognition techniques, which need to be more efficient without losing their effectiveness. We have tried to circumvent the problem by reducing it to the fast computation of an optimum-path forest (OPF) in a graph derived from the trainin...

Fast Petroleum Well Drilling Monitoring Through Optimum-Path Forest

Automatic inspection of petroleum well drilling has become paramount in recent years, mainly because of the crucial importance of saving time and operations during the drilling process in order to avoid problems such as the collapse of the well borehole walls. In this paper, we extended another work by proposing a fast petroleum well drilling monitoring through a modified version of the...

Supervised Pattern Classification Using Optimum-Path Forest

We present a graph-based framework for pattern recognition, called Optimum-Path Forest (OPF), and describe one of its classifiers developed for the supervised learning case. This classifier does not require parameters and can handle some overlapping among multiple classes with arbitrary shapes. The method reduces the pattern recognition problem into the computation of an optimum-path forest in ...

Land Use Classification Using Optimum-Path Forest

This paper introduces the Optimum-Path Forest for land use classification, aiming at better environmental management, using images obtained from the CBERS 2B CCD satellite covering the area of the Rio das Pedras watershed, Itatinga City, São Paulo State, Brazil. We also compared the Optimum-Path Forest algorithm with the well-known supervised classifiers: Artificial Neural Networks using Mu...

Journal

Journal title: Knowledge Based Systems

Year: 2022

ISSN: 1872-7409, 0950-7051

DOI: https://doi.org/10.1016/j.knosys.2022.108445